GLR*: A Robust Parser for Spontaneously Spoken Language

Author

Alon Lavie

Abstract

This paper describes GLR*, a parsing system based on Tomita's Generalized LR parsing algorithm that was designed to be robust to two particular types of extra-grammaticality: noise in the input and limited grammar coverage. GLR* attempts to overcome these forms of extra-grammaticality by ignoring the unparsable words and fragments and searching for the maximal subset of the original input that is covered by the grammar. The parser is coupled with a beam search heuristic that limits the combinations of skipped words considered by the parser and ensures that the parser operates within feasible time and space bounds. The parsing system includes several tools designed to address the difficulties of parsing spontaneous speech: a statistical disambiguation module, an integrated heuristic for evaluating and ranking the parses produced by the parser, and a parse quality heuristic that allows the parser to self-judge the quality of the parse chosen as best. To evaluate its suitability for parsing spontaneous speech, the GLR* parser was integrated into the JANUS speech translation system. Our evaluations on both transcribed and speech-recognized input indicate that the version of the system that uses GLR* produces about 30% more acceptable translations than a corresponding version that uses the original non-robust GLR parser.
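
The idea of skipping words to find a maximal parsable subset under a bounded search can be illustrated with a small sketch. This is not the GLR* algorithm itself, which integrates skipping into GLR parsing; it is a brute-force stand-in that uses a toy CYK recognizer. The grammar, the lexicon, the function names `parses` and `best_skip_parse`, and the `beam` parameter (here simply a cap on the number of skip combinations tried) are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy grammar in Chomsky normal form: (left, right) -> parent.
BINARY = {("NP", "VP"): "S", ("V", "NP"): "VP", ("Det", "N"): "NP"}
# Hypothetical lexicon; words missing from it (e.g. "um") are treated as noise.
LEXICON = {"we": {"NP"}, "see": {"V"}, "the": {"Det"}, "doctor": {"N"}}


def parses(words):
    """CYK recognizer: True if the toy grammar derives S over all the words."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][k] holds the categories that span words[i:k].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for left in chart[i][j]:
                    for right in chart[j][k]:
                        parent = BINARY.get((left, right))
                        if parent:
                            chart[i][k].add(parent)
    return "S" in chart[0][n]


def best_skip_parse(words, beam=50):
    """Return (kept, skipped) for the first parsable subset, or None.

    Candidates are tried from fewest skipped words to most, so the first
    success covers a maximal number of input words. `beam` caps how many
    skip combinations are examined, keeping the search bounded.
    """
    n = len(words)
    tried = 0
    for n_skipped in range(n):
        for skipped in combinations(range(n), n_skipped):
            if tried >= beam:
                return None  # search budget exhausted
            tried += 1
            kept = [w for i, w in enumerate(words) if i not in skipped]
            if parses(kept):
                return kept, [words[i] for i in skipped]
    return None


print(best_skip_parse("um we see the uh doctor".split()))
```

With the default budget the sketch drops the two filler words and keeps the four that the toy grammar covers. With a much smaller budget, such as `beam=5`, it gives up before reaching any two-word skip, which mirrors the trade-off between coverage and bounded time that the abstract describes.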

Similar resources

JANUS: a Multi-lingual Speech-to-speech Translation System for Spontaneously Spoken Language in a Limited Domain

Janus is a multilingual speech translation system currently operating in the domain of meeting scheduling. Translating spontaneous speech requires a high degree of robustness to overcome the disfluencies of spoken language as well as errors in speech recognition. In this system description, we focus on the robust speech translation components in Janus: the skipping GLR* parser, the segmentation o...

GLR*: A Robust Grammar-Focused Parser for Spontaneously Spoken Language

The analysis of spoken language is widely considered to be a more challenging task than the analysis of written text. All of the difficulties of written language can generally be found in spoken language as well. Parsing spontaneous speech must, however, also deal with problems such as speech disfluencies, the looser notion of grammaticality, and the lack of clearly marked sentence boundaries. ...

PROFER: predictive, robust finite-state parsing for spoken language

The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network, or as a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser can ult...

Multi-lingual Translation of Spontaneously Spoken Language in a Limited Domain

JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In an attempt to achieve both robustness and translation accuracy we use two different translation components: the GLR module, designed to be more accurate, and the Phoenix module, designed to be more robust. We analyze th...

Comparative Study of GLR Parser with Finite-state Predictors and Chart-based Semantic Parsers

The natural language processing component of a speech understanding system is commonly a robust, semantic parser, implemented as either a chart-based transition network, or as a generalized left-right (GLR) parser. In contrast, we are developing a robust, semantic parser that is a single, predictive finite-state machine. Our approach is motivated by our belief that such a finite-state parser ca...

Publication date: 1996
